Fundamentals of ETL Service Architecture
The ETL service comprises two parts: the staging engine and the storage service. The staging engine manages the staging process for all data received from the various source systems. It interfaces with the Administrator Workbench (AWB) scheduler and monitor to schedule and monitor data load processes. The storage service manages and provides access to the data targets in SAP BW and to the aggregates stored in relational and multidimensional database management systems.
The extraction technology provided as an integral part of SAP BW is restricted to database management systems supported by mySAP technology; it cannot extract data from other database systems such as IBM IMS and Sybase. Nor does it support proprietary file formats such as dBase, Microsoft Access, and Microsoft Excel. However, the ETL services layer of SAP BW provides all the functionality required to load data from non-SAP systems in exactly the same way as data from SAP systems. It offers open interfaces for loading non-SAP data, and SAP BW does not distinguish between different types of source systems once data has arrived in the staging area.
Extraction at Service Levels
SAP BW can be integrated with other SAP components through the service application programming interface (API). The service API provides a framework for comprehensive data replication based on data extractors that encapsulate the application logic. A data extractor fills the extract structure of a DataSource with data from the source system and offers sophisticated handling of changes. In addition to supporting extractors, the service API enables online access via RemoteCube technology and flexible staging for hierarchies.
For non-SAP sources, SAP provides an open interface called the Staging Business Application Programming Interface (BAPI). The Staging BAPI connects third-party ETL tools to SAP BW and provides access to SAP BW objects, which makes it possible to use customer extraction routines.
At the database or file level, data can be extracted using DB Connect, flat files, and XML. DB Connect extracts data directly from a DBMS; its metadata is loaded by replicating the metadata of tables and views into the metadata repository of SAP BW. Data can also be uploaded from flat files by creating extraction routines, and XML data can be loaded through the Administrator Workbench in SAP BW.
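The change handling that data extractors perform can be illustrated outside of SAP. The following is a minimal sketch in Python, not SAP code and not the service API: the `ExtractRecord` structure and the `extract_delta` function are hypothetical names introduced here, and the sketch assumes that changes are detected by comparing two snapshots of a source table keyed by a document number.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ExtractRecord:
    """Hypothetical extract structure: one flat record per changed source row."""
    key: str
    amount: float
    change_type: str  # "new", "changed", or "deleted"


def extract_delta(previous: Dict[str, float],
                  current: Dict[str, float]) -> List[ExtractRecord]:
    """Compare two snapshots of a source table and emit only the changes."""
    records: List[ExtractRecord] = []
    # Rows that are new or whose value differs from the previous snapshot.
    for key in sorted(current):
        if key not in previous:
            records.append(ExtractRecord(key, current[key], "new"))
        elif previous[key] != current[key]:
            records.append(ExtractRecord(key, current[key], "changed"))
    # Rows that existed before but are missing now; report the last known value.
    for key in sorted(previous):
        if key not in current:
            records.append(ExtractRecord(key, previous[key], "deleted"))
    return records
```

Given a previous snapshot `{"A": 10.0, "B": 20.0, "C": 5.0}` and a current snapshot `{"A": 10.0, "B": 25.0, "D": 7.0}`, the function emits a "changed" record for B, a "new" record for D, and a "deleted" record for C, while the unchanged row A is not transferred at all.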
Components of ETL Services at Database or File Level
Operational Data Store: The operational data store (ODS) stores detailed data and supports tactical, day-to-day decision making. SAP views the ODS as a near-real-time informational environment that supports operational reporting by interacting with existing transactional systems, data warehouses, or analytical applications. SAP BW allows flexible access to data in the ODS, the data warehouse, and the multidimensional models.
Data Marts: A data mart provides the data needed by a decentralized function, department, or business area. Weigh the pros and cons before developing one. A data mart can be implemented faster and more cheaply than a data warehouse, sometimes costing 80% less than a full data warehouse. But as data marts proliferate, the cost advantage can disappear: the IT organization must maintain the individual data marts and the multitude of ETL and warehouse management processes that go with them. Multiple data marts can also complicate data integration efforts, increase the amount of inconsistent data, require more business rules, and create the data stovepipes that data warehousing strives to eliminate.
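The point at which the cost advantage of data marts disappears can be estimated with simple arithmetic. The sketch below is illustrative only: the function name `marts_to_match_warehouse` is hypothetical, and it assumes that every data mart costs the same fixed percentage of a full data warehouse and that there is no additional integration overhead, which in practice would move the break-even point earlier.

```python
def marts_to_match_warehouse(saving_pct: int) -> int:
    """Return the number of data marts whose combined cost reaches
    the cost of one full data warehouse.

    saving_pct is the assumed saving of one mart relative to the
    warehouse, as a whole-number percentage (80 means 80% cheaper).
    """
    mart_cost_pct = 100 - saving_pct
    if mart_cost_pct <= 0:
        raise ValueError("saving_pct must be below 100")
    # Ceiling division on integers avoids floating-point rounding issues.
    return -(-100 // mart_cost_pct)
```

With the 80% figure quoted above, each mart costs 20% of a warehouse, so `marts_to_match_warehouse(80)` returns 5: under these assumptions, the fifth data mart brings total spending level with a single data warehouse.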
Interfaces: The data mart interface enables users to transfer and update transactional data and metadata from one SAP BW system to other SAP BW systems.
Open Hub Services: The open hub service is used to share data in SAP BW with non-SAP data marts, analytical applications, and other applications. This service controls data distribution and maintains data consistency across systems. With the open hub service, actual data and the corresponding metadata are retrieved from InfoCubes or ODS objects.
Understanding the Role of the Storage Services Layer in the Architectural Model
Master Data Manager: The Master Data Manager generates the master data infrastructure, which contains the master data tables as well as the master data update and retrieval routines. It also maintains master data and provides access to it for SAP BW reporting components and for export to other data warehouse systems via the analysis and access services.
ODS Manager: The ODS Manager generates the ODS object infrastructure. It maintains an active data table that holds the ODS object data and a change log that records every update to that data as the updates are applied. It also provides access to ODS object data for SAP BW reporting and analysis functionality.
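The relationship between the active data table and the change log can be sketched as follows. This is a simplified Python model, not SAP's implementation: the class `OdsObjectSketch` and its methods are hypothetical, and the change log is assumed to store a before value and an after value for each update.

```python
from typing import Dict, List, Optional, Tuple

# One change-log entry: (key, value before the update, value after the update).
ChangeLogEntry = Tuple[str, Optional[int], int]


class OdsObjectSketch:
    """Toy model of an ODS object with an active table and a change log."""

    def __init__(self) -> None:
        self.active: Dict[str, int] = {}            # current state per key
        self.change_log: List[ChangeLogEntry] = []  # history of all updates

    def apply_update(self, key: str, value: int) -> None:
        """Log the update first, then overwrite the active record."""
        before = self.active.get(key)  # None if the key is new
        self.change_log.append((key, before, value))
        self.active[key] = value

    def read(self, key: str) -> Optional[int]:
        """Reporting access reads only the active table."""
        return self.active.get(key)
```

After applying the updates `("order-1", 100)`, `("order-1", 120)`, and `("order-2", 50)`, the active table holds only the latest value per key, while the change log retains all three updates in order, which is what allows downstream data targets to be supplied with changes rather than full reloads.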
Archiving Manager: The Archiving Manager stores unused, dormant data in an archive with the help of the Archive Development Kit (ADK). The ADK is connected to SAP BW through the Archiving Manager. The Archiving Manager also keeps track of relevant metadata, such as InfoCubes and ODS objects, which may change over time.
InfoCube Manager: The InfoCube Manager generates the InfoCube meta tables. It maintains the InfoCube data tables and provides access to them for SAP BW reporting and analysis.